pacman::p_load(arrow, lubridate, tidyverse, tmap, sf, spatstat, raster, spNetwork,sp)Take-home Exercise 1: Application of Spatial Point Patterns Analysis – Geographical Distribution of Grab hailing services (Singapore)
Context
Introduction
Human mobility, the movement of people in space and time, reflects the spatial-temporal characteristics of human behaviour (Wang et al., 2021). With the advancement Information and Communication Technologies (ICT), pertinent through the ubiquitous use of smartphones, a large and growing volume of data relating to human mobility is available today. By using appropriate GIS analysis methods, such data are potentially useful in supporting smart-city planning and management.
Data
In 2020, an interesting human mobility data set called Grab Posisi was released by GRAB, one of the largest shared taxi operators in Southeast Asia. One of the two data sets released is focusing on Singapore only.
In Singapore, relevant human mobility data can be extracted from Land Transport Authority (LTA) DataMall. In particular, two data sets related to human mobility are provided by the portal: Passenger Volume by Origin Destination Train Stations and Passenger Volume by Origin Destination Bus Stops.
Limitation
A limitation of the LTA data sets is that their location is biased to either bus stops or MRT/LRT stations.
Importing Packages
The following R packages are used for this assignment:
arrow, to read and write Parquet files (format in which data is in)
lubridate, to work with time-related data more easily
tidyverse, for importing, wrangling and visualising data
tmap, to create thematic maps
sf, for importing, managing and processing geospatial data
spatstat, for point pattern analysis
raster, reads, writes, manipulates, analyses and model of gridded spatial data (raster)
spNetwork, to perform kernel density estimation and build spatial matrices to conduct spatial analysis on reticular distances
Data
Geospatial Data
Data preprocessing
Master Plan 2019 Subzone Boundary
First, we want to get a clear Singapore boundary layer.
Extracted from data.gov.sg, let’s read the 2019 master plan subzone boundary layer.
mpsz <- st_read("data/geospatial/master-plan-2019-subzone-boundary-no-sea-kml.kml")
summary(mpsz)- The dimension contains XYZ, meaning that
mpszcontains 3D geometries
Convert mpsz to 2D geometry using st_zm() from sf and save the converted form as a shapefile.
mpsz <- st_zm(mpsz, drop = TRUE) # Convert 3D geometries to 2DCreate a new column SUBZONE_N containing the subzone names.
# create a new column 'SUBZONE_N' and extract subzone names from 'Dscrptn' field
mpsz <- mpsz %>%
rowwise() %>%
mutate(SUBZONE_N = str_extract(Description, "<th>SUBZONE_N</th> <td>(.*?)</td>")) %>%
ungroup()
# remove HTML tags and 'SUBZONE_N' from 'SUBZONE_N' column
mpsz$SUBZONE_N <- str_remove_all(mpsz$SUBZONE_N, "<.*?>|SUBZONE_N")
# view the updated 'mpsz3414' dataframe
head(mpsz$SUBZONE_N)Create a new column PLN_AREA_N containing the planning area names.
# create a new column 'PLN_AREA_N' and extract subzone names from 'Dscrptn' field
mpsz <- mpsz %>%
rowwise() %>%
mutate(PLN_AREA_N = str_extract(Description, "<th>PLN_AREA_N</th> <td>(.*?)</td>")) %>%
ungroup()
# remove HTML tags and 'PLN_AREA_N' from 'PLN_AREA_N' column
mpsz$PLN_AREA_N <- str_remove_all(mpsz$PLN_AREA_N, "<.*?>|PLN_AREA_N")# remove the 'Description' column using the $ operator
mpsz$Description <- NULL#st_write(mpsz, "data/geospatial/mpsz.shp")mpsz_sf <- st_read("data/geospatial/mpsz.shp")Reading layer `mpsz' from data source
`C:\guacodemoleh\IS415-GAA\Take-home_Ex\Take-home_Ex01\data\geospatial\mpsz.shp'
using driver `ESRI Shapefile'
Simple feature collection with 332 features and 3 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
class(mpsz_sf)[1] "sf" "data.frame"
Then transform the CRS from WGS84 to SVY21.
mpsz3414 <- st_transform(mpsz_sf, 3414)
class(mpsz3414)[1] "sf" "data.frame"
No commercial ride hailing services are available in the outer islands of Singapore. Let’s identify them first.

Filter out the outer islands.
outer_islands <- c("SEMAKAU", "SUDONG", "NORTH-EASTERN ISLANDS", "SOUTHERN GROUP")
# remove rows where 'SUBZONE_N' is in the list
mpsz3414 <- mpsz3414 %>%
filter(!str_trim(SUBZONE_N) %in% str_trim(outer_islands))
class(mpsz3414)[1] "sf" "data.frame"
Then dissolve all the inner subzone boundaries using st_union() (if necessary).
sg_sf <- mpsz3414 %>% st_union()
plot(sg_sf)
OSM Singapore Roads
Get the roads layer from OSM. Since the file is very large, it takes a long time to get the intersection between Singapore and roads (I took about 7 minutes to execute this step).
roads <- st_read(dsn="data/geospatial", layer="gis_osm_roads_free_1") %>%
st_transform(crs=3414)
sg_roads <- st_intersection(roads,sg_sf)
plot(sg_roads) Alternatively you may use QGIS to select the roads in Singapore and save the roads into a shape file.
The following two code chunks can be uncommented if you are taking this approach.
sg_roads <- st_read(dsn= "data/geospatial", layer="sg_roads")
str(sg_roads)Verify that roads are in the correct in CRS
st_crs(sg_roads) <- st_crs(3414)
st_crs(sg_roads)- The CRS ID is 3414 which is correct.
Data Size Management
Before further data manipulation, save the files in rds format as the size of roads3414 is very large.
write_rds(sg_roads,"data/geospatial/rds/sg_roads.rds")Read the written roads file.
sg_roads <- read_rds("data/geospatial/rds/sg_roads.rds")Aspatial Data
Data preprocessing
Read all the Grab-Posisi using arrow’s read_parquet() function to read the parquet files containing the ride trajectories.
df0 <- read_parquet("data/aspatial/part-00000-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")
df1 <- read_parquet("data/aspatial/part-00001-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")
df2 <- read_parquet("data/aspatial/part-00002-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")
df3 <- read_parquet("data/aspatial/part-00003-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")
df4 <- read_parquet("data/aspatial/part-00004-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")
df5 <- read_parquet("data/aspatial/part-00005-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")
df6 <- read_parquet("data/aspatial/part-00006-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")
df7 <- read_parquet("data/aspatial/part-00007-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")
df8 <- read_parquet("data/aspatial/part-00008-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")
df9 <- read_parquet("data/aspatial/part-00009-8bbff892-97d2-4011-9961-703e38972569.c000.snappy.parquet")Then combine the rows using the bindrows function in the dplyr package embedded in the tidyverse package.
df <- bind_rows(df0, df1, df2, df3, df4, df5, df6, df7, df8, df9)
head(df)The observed pingtimestamp is not easily comprehensible, thus it needs to be amended to a proper timestamp.
df$pingtimestamp <- as_datetime(df$pingtimestamp) ## $ to overwrite the variable in df
head(df$pingtimestamp)The data can be further segmented to two groups:
origin_dfrepresenting the starting locations of trips takendest_dfrepresenting the end locations of trips taken
Starting Locations
origin_df <- df %>%
group_by(trj_id) %>%
arrange(pingtimestamp) %>%
filter(row_number()==1) %>%
mutate(weekday = wday(pingtimestamp,
label=TRUE,
abbr=TRUE),
start_hr = factor(hour(pingtimestamp)),
day = factor(mday(pingtimestamp)))End Locations
dest_df <- df %>%
group_by(trj_id) %>%
arrange(desc(pingtimestamp)) %>%
filter(row_number()==1) %>%
mutate(weekday = wday(pingtimestamp,
label=TRUE,
abbr=TRUE),
end_hr = factor(hour(pingtimestamp)),
day = factor(mday(pingtimestamp)))Data Size Management
Before further data manipulation, save the files in rds format as the size of the data frames are very large. Running the whole tibble data frame will not be practical and the processing time will be very long.
write_rds(origin_df,"data/aspatial/rds/origin_df.rds")
write_rds(dest_df,"data/aspatial/rds/dest_df.rds")origin_df <- read_rds("data/aspatial/rds/origin_df.rds")
dest_df <- read_rds("data/aspatial/rds/dest_df.rds")Further Observation
Inspect the fields in origin_df and dest_df.
list(origin_df)[[1]]
# A tibble: 28,000 × 12
# Groups: trj_id [28,000]
trj_id driving_mode osname pingtimestamp rawlat rawlng speed bearing
<chr> <chr> <chr> <dttm> <dbl> <dbl> <dbl> <int>
1 70895 car android 2019-04-08 00:09:26 1.38 104. 9.95 111
2 21926 car android 2019-04-08 00:09:48 1.29 104. 11.0 75
3 47498 car ios 2019-04-08 00:09:50 1.38 104. 18.6 307
4 18103 car android 2019-04-08 00:09:55 1.45 104. 0.404 159
5 41322 car android 2019-04-08 00:09:57 1.28 104. 17.9 232
6 64813 car ios 2019-04-08 00:10:03 1.31 104. 17.1 106
7 81518 car ios 2019-04-08 00:10:14 1.31 104. 6.24 213
8 66542 car android 2019-04-08 00:11:17 1.36 104. 9.11 179
9 25201 car ios 2019-04-08 00:12:05 1.37 104. 12.0 211
10 82401 car android 2019-04-08 00:12:11 1.30 104. 10.6 107
# ℹ 27,990 more rows
# ℹ 4 more variables: accuracy <dbl>, weekday <ord>, start_hr <fct>, day <fct>
list(dest_df)[[1]]
# A tibble: 28,000 × 12
# Groups: trj_id [28,000]
trj_id driving_mode osname pingtimestamp rawlat rawlng speed bearing
<chr> <chr> <chr> <dttm> <dbl> <dbl> <dbl> <int>
1 81574 car ios 2019-04-21 23:56:49 1.34 104. 15.3 103
2 54687 car android 2019-04-21 23:56:46 1.44 104. 8.15 299
3 17190 car android 2019-04-21 23:56:36 1.34 104. 12.4 202
4 13793 car android 2019-04-21 23:56:30 1.32 104. 6.47 170
5 39014 car ios 2019-04-21 23:56:27 1.33 104. 3.59 169
6 41170 car ios 2019-04-21 23:56:13 1.32 104. 13.1 71
7 64519 car ios 2019-04-21 23:55:49 1.43 104. 14.3 239
8 70461 car ios 2019-04-21 23:55:32 1.29 104. 0.970 51
9 41154 car ios 2019-04-21 23:55:10 1.32 104. 10.8 118
10 65488 car ios 2019-04-21 23:54:47 1.38 104. 0 244
# ℹ 27,990 more rows
# ℹ 4 more variables: accuracy <dbl>, weekday <ord>, end_hr <fct>, day <fct>
Creating a Simple Feature Data Frame from an Aspatial Data Frame
Convert the data frames into simple features data frame such that the locations can be plotted.
Locations
Starting Locations
origin_sf <- origin_df %>%
st_as_sf(coords=c("rawlng","rawlat"),
crs=4326) %>%
st_transform(crs = 3414)End Locations
dest_sf <- dest_df %>%
st_as_sf(coords=c("rawlng","rawlat"),
crs=4326) %>%
st_transform(crs = 3414)Preparation for Spatial Point Patterns Analysis
Converting sf Format into spatstat’s ppp Format
Use as.ppp() of spatstat to convert the sf object ppp format.
Starting Locations
origin_ppp <- as.ppp(st_coordinates(origin_sf), st_bbox(origin_sf))
plot(origin_ppp)
Ending Locations
dest_ppp <- as.ppp(st_coordinates(dest_sf), st_bbox(dest_sf))
plot(dest_ppp)
Look at summary statistics.
Starting Locations
summary(origin_ppp)Planar point pattern: 28000 points
Average intensity 2.473666e-05 points per square unit
Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units
Window: rectangle = [3628.24, 49845.23] x [25198.14, 49689.64] units
(46220 x 24490 units)
Window area = 1131920000 square units
Ending Locations
summary(dest_ppp)Planar point pattern: 28000 points
Average intensity 2.493661e-05 points per square unit
Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units
Window: rectangle = [3637.21, 49870.63] x [25221.3, 49507.79] units
(46230 x 24290 units)
Window area = 1122850000 square units
any(duplicated(origin_ppp))[1] FALSE
any(duplicated(dest_ppp))[1] FALSE
- With no duplicated starting locations nor ending locations, we can proceed to set up for our spatial point patterns analysis.
Creating owin object
sg_owin <- as.owin(sg_sf)
plot(sg_owin)
summary(sg_owin)Window: polygonal boundary
56 separate polygons (36 holes)
vertices area relative.area
polygon 1 15307 7.00834e+08 9.92e-01
polygon 2 285 1.61128e+06 2.28e-03
polygon 3 27 1.50315e+04 2.13e-05
polygon 4 (hole) 41 -4.01660e+04 -5.69e-05
polygon 5 (hole) 317 -5.11280e+04 -7.24e-05
polygon 6 (hole) 3 -4.14099e-04 -5.86e-13
polygon 7 30 2.80002e+04 3.97e-05
polygon 8 (hole) 4 -2.86396e-01 -4.06e-10
polygon 9 (hole) 3 -1.81439e-04 -2.57e-13
polygon 10 (hole) 3 -8.68789e-04 -1.23e-12
polygon 11 (hole) 3 -5.99535e-04 -8.49e-13
polygon 12 (hole) 3 -3.04561e-04 -4.31e-13
polygon 13 (hole) 3 -4.46076e-04 -6.32e-13
polygon 14 (hole) 3 -3.39794e-04 -4.81e-13
polygon 15 (hole) 3 -4.52043e-05 -6.40e-14
polygon 16 (hole) 3 -3.90173e-05 -5.53e-14
polygon 17 (hole) 3 -9.59850e-05 -1.36e-13
polygon 18 (hole) 4 -2.54488e-04 -3.60e-13
polygon 19 (hole) 4 -4.28453e-01 -6.07e-10
polygon 20 (hole) 4 -2.18616e-04 -3.10e-13
polygon 21 (hole) 5 -2.44411e-04 -3.46e-13
polygon 22 (hole) 5 -3.64686e-02 -5.16e-11
polygon 23 71 8.18750e+03 1.16e-05
polygon 24 (hole) 6 -8.37554e-01 -1.19e-09
polygon 25 (hole) 38 -7.79904e+03 -1.10e-05
polygon 26 (hole) 3 -3.41897e-05 -4.84e-14
polygon 27 (hole) 3 -3.65499e-03 -5.18e-12
polygon 28 (hole) 3 -4.95057e-02 -7.01e-11
polygon 29 91 1.49663e+04 2.12e-05
polygon 30 (hole) 3 -3.79135e-02 -5.37e-11
polygon 31 (hole) 5 -2.92235e-04 -4.14e-13
polygon 32 (hole) 3 -7.43616e-06 -1.05e-14
polygon 33 (hole) 270 -1.21455e+03 -1.72e-06
polygon 34 (hole) 19 -4.39650e+00 -6.23e-09
polygon 35 (hole) 35 -1.38385e+02 -1.96e-07
polygon 36 (hole) 23 -1.99656e+01 -2.83e-08
polygon 37 40 1.38607e+04 1.96e-05
polygon 38 (hole) 41 -6.00381e+03 -8.50e-06
polygon 39 (hole) 7 -1.40546e-01 -1.99e-10
polygon 40 (hole) 11 -8.36705e+01 -1.18e-07
polygon 41 (hole) 3 -2.33435e-03 -3.31e-12
polygon 42 45 2.51218e+03 3.56e-06
polygon 43 139 3.22293e+03 4.56e-06
polygon 44 148 3.10395e+03 4.40e-06
polygon 45 (hole) 4 -1.72650e-04 -2.44e-13
polygon 46 75 1.73526e+04 2.46e-05
polygon 47 83 5.28920e+03 7.49e-06
polygon 48 106 3.04104e+03 4.31e-06
polygon 49 264 1.50631e+06 2.13e-03
polygon 50 71 5.63061e+03 7.97e-06
polygon 51 10 1.99717e+02 2.83e-07
polygon 52 (hole) 3 -1.37223e-02 -1.94e-11
polygon 53 487 2.06117e+06 2.92e-03
polygon 54 65 8.42861e+04 1.19e-04
polygon 55 47 3.82087e+04 5.41e-05
polygon 56 22 6.74651e+03 9.55e-06
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units
(53270 x 28810 units)
Window area = 706156000 square units
Fraction of frame area: 0.46
Combining point events object and owin object
Extract points that are located within Singapore
Starting Locations
originSG_ppp = origin_ppp[sg_owin]Ending Locations
destSG_ppp = dest_ppp[sg_owin]The output object contains the combination of the point and polygon feature into one ppp object class.
Starting Locations
summary(originSG_ppp)Planar point pattern: 28000 points
Average intensity 3.965129e-05 points per square unit
Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units
Window: polygonal boundary
56 separate polygons (36 holes)
vertices area relative.area
polygon 1 15307 7.00834e+08 9.92e-01
polygon 2 285 1.61128e+06 2.28e-03
polygon 3 27 1.50315e+04 2.13e-05
polygon 4 (hole) 41 -4.01660e+04 -5.69e-05
polygon 5 (hole) 317 -5.11280e+04 -7.24e-05
polygon 6 (hole) 3 -4.14099e-04 -5.86e-13
polygon 7 30 2.80002e+04 3.97e-05
polygon 8 (hole) 4 -2.86396e-01 -4.06e-10
polygon 9 (hole) 3 -1.81439e-04 -2.57e-13
polygon 10 (hole) 3 -8.68789e-04 -1.23e-12
polygon 11 (hole) 3 -5.99535e-04 -8.49e-13
polygon 12 (hole) 3 -3.04561e-04 -4.31e-13
polygon 13 (hole) 3 -4.46076e-04 -6.32e-13
polygon 14 (hole) 3 -3.39794e-04 -4.81e-13
polygon 15 (hole) 3 -4.52043e-05 -6.40e-14
polygon 16 (hole) 3 -3.90173e-05 -5.53e-14
polygon 17 (hole) 3 -9.59850e-05 -1.36e-13
polygon 18 (hole) 4 -2.54488e-04 -3.60e-13
polygon 19 (hole) 4 -4.28453e-01 -6.07e-10
polygon 20 (hole) 4 -2.18616e-04 -3.10e-13
polygon 21 (hole) 5 -2.44411e-04 -3.46e-13
polygon 22 (hole) 5 -3.64686e-02 -5.16e-11
polygon 23 71 8.18750e+03 1.16e-05
polygon 24 (hole) 6 -8.37554e-01 -1.19e-09
polygon 25 (hole) 38 -7.79904e+03 -1.10e-05
polygon 26 (hole) 3 -3.41897e-05 -4.84e-14
polygon 27 (hole) 3 -3.65499e-03 -5.18e-12
polygon 28 (hole) 3 -4.95057e-02 -7.01e-11
polygon 29 91 1.49663e+04 2.12e-05
polygon 30 (hole) 3 -3.79135e-02 -5.37e-11
polygon 31 (hole) 5 -2.92235e-04 -4.14e-13
polygon 32 (hole) 3 -7.43616e-06 -1.05e-14
polygon 33 (hole) 270 -1.21455e+03 -1.72e-06
polygon 34 (hole) 19 -4.39650e+00 -6.23e-09
polygon 35 (hole) 35 -1.38385e+02 -1.96e-07
polygon 36 (hole) 23 -1.99656e+01 -2.83e-08
polygon 37 40 1.38607e+04 1.96e-05
polygon 38 (hole) 41 -6.00381e+03 -8.50e-06
polygon 39 (hole) 7 -1.40546e-01 -1.99e-10
polygon 40 (hole) 11 -8.36705e+01 -1.18e-07
polygon 41 (hole) 3 -2.33435e-03 -3.31e-12
polygon 42 45 2.51218e+03 3.56e-06
polygon 43 139 3.22293e+03 4.56e-06
polygon 44 148 3.10395e+03 4.40e-06
polygon 45 (hole) 4 -1.72650e-04 -2.44e-13
polygon 46 75 1.73526e+04 2.46e-05
polygon 47 83 5.28920e+03 7.49e-06
polygon 48 106 3.04104e+03 4.31e-06
polygon 49 264 1.50631e+06 2.13e-03
polygon 50 71 5.63061e+03 7.97e-06
polygon 51 10 1.99717e+02 2.83e-07
polygon 52 (hole) 3 -1.37223e-02 -1.94e-11
polygon 53 487 2.06117e+06 2.92e-03
polygon 54 65 8.42861e+04 1.19e-04
polygon 55 47 3.82087e+04 5.41e-05
polygon 56 22 6.74651e+03 9.55e-06
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units
(53270 x 28810 units)
Window area = 706156000 square units
Fraction of frame area: 0.46
Ending Locations
summary(destSG_ppp)Planar point pattern: 27997 points
Average intensity 3.964704e-05 points per square unit
Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units
Window: polygonal boundary
56 separate polygons (36 holes)
vertices area relative.area
polygon 1 15307 7.00834e+08 9.92e-01
polygon 2 285 1.61128e+06 2.28e-03
polygon 3 27 1.50315e+04 2.13e-05
polygon 4 (hole) 41 -4.01660e+04 -5.69e-05
polygon 5 (hole) 317 -5.11280e+04 -7.24e-05
polygon 6 (hole) 3 -4.14099e-04 -5.86e-13
polygon 7 30 2.80002e+04 3.97e-05
polygon 8 (hole) 4 -2.86396e-01 -4.06e-10
polygon 9 (hole) 3 -1.81439e-04 -2.57e-13
polygon 10 (hole) 3 -8.68789e-04 -1.23e-12
polygon 11 (hole) 3 -5.99535e-04 -8.49e-13
polygon 12 (hole) 3 -3.04561e-04 -4.31e-13
polygon 13 (hole) 3 -4.46076e-04 -6.32e-13
polygon 14 (hole) 3 -3.39794e-04 -4.81e-13
polygon 15 (hole) 3 -4.52043e-05 -6.40e-14
polygon 16 (hole) 3 -3.90173e-05 -5.53e-14
polygon 17 (hole) 3 -9.59850e-05 -1.36e-13
polygon 18 (hole) 4 -2.54488e-04 -3.60e-13
polygon 19 (hole) 4 -4.28453e-01 -6.07e-10
polygon 20 (hole) 4 -2.18616e-04 -3.10e-13
polygon 21 (hole) 5 -2.44411e-04 -3.46e-13
polygon 22 (hole) 5 -3.64686e-02 -5.16e-11
polygon 23 71 8.18750e+03 1.16e-05
polygon 24 (hole) 6 -8.37554e-01 -1.19e-09
polygon 25 (hole) 38 -7.79904e+03 -1.10e-05
polygon 26 (hole) 3 -3.41897e-05 -4.84e-14
polygon 27 (hole) 3 -3.65499e-03 -5.18e-12
polygon 28 (hole) 3 -4.95057e-02 -7.01e-11
polygon 29 91 1.49663e+04 2.12e-05
polygon 30 (hole) 3 -3.79135e-02 -5.37e-11
polygon 31 (hole) 5 -2.92235e-04 -4.14e-13
polygon 32 (hole) 3 -7.43616e-06 -1.05e-14
polygon 33 (hole) 270 -1.21455e+03 -1.72e-06
polygon 34 (hole) 19 -4.39650e+00 -6.23e-09
polygon 35 (hole) 35 -1.38385e+02 -1.96e-07
polygon 36 (hole) 23 -1.99656e+01 -2.83e-08
polygon 37 40 1.38607e+04 1.96e-05
polygon 38 (hole) 41 -6.00381e+03 -8.50e-06
polygon 39 (hole) 7 -1.40546e-01 -1.99e-10
polygon 40 (hole) 11 -8.36705e+01 -1.18e-07
polygon 41 (hole) 3 -2.33435e-03 -3.31e-12
polygon 42 45 2.51218e+03 3.56e-06
polygon 43 139 3.22293e+03 4.56e-06
polygon 44 148 3.10395e+03 4.40e-06
polygon 45 (hole) 4 -1.72650e-04 -2.44e-13
polygon 46 75 1.73526e+04 2.46e-05
polygon 47 83 5.28920e+03 7.49e-06
polygon 48 106 3.04104e+03 4.31e-06
polygon 49 264 1.50631e+06 2.13e-03
polygon 50 71 5.63061e+03 7.97e-06
polygon 51 10 1.99717e+02 2.83e-07
polygon 52 (hole) 3 -1.37223e-02 -1.94e-11
polygon 53 487 2.06117e+06 2.92e-03
polygon 54 65 8.42861e+04 1.19e-04
polygon 55 47 3.82087e+04 5.41e-05
polygon 56 22 6.74651e+03 9.55e-06
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units
(53270 x 28810 units)
Window area = 706156000 square units
Fraction of frame area: 0.46
Plot the before and after to compare the outputs Starting Locations
par(mfrow=c(1,2))
par(mar = c(3,3,2,1))
plot(origin_ppp)
plot(originSG_ppp)
Ending Locations
par(mfrow=c(1,2))
par(mar = c(3,3,2,1))
plot(dest_ppp)
plot(destSG_ppp)
Exploratory Data Analysis
From the raster images, it is difficult to determine which parts of Singapore had higher demand for pick-up and drop-off.
Additional Data Consolidation
Add a count column for starting and ending locations in mpsz3414
# form common identifier by passing SUBZONE_N and PLN_AREA_N over
joined_origin <- st_join(origin_sf, mpsz3414)
joined_dest <- st_join(dest_sf, mpsz3414)
# create count number by subzone
origin_counts <- joined_origin %>%
group_by(SUBZONE_N) %>%
summarise(total_origin = n())
dest_counts <- joined_dest %>%
group_by(SUBZONE_N) %>%
summarise(total_dest = n())
# insert the count columns into mpsz3414
mpsz3414 <- st_join(mpsz3414, origin_counts, by = "SUBZONE_N")
mpsz3414 <- st_join(mpsz3414, dest_counts, by = "SUBZONE_N")
# remove the duplicated subzone columns
mpsz3414 <- mpsz3414[, !names(mpsz3414) %in% c("SUBZONE_N.x", "SUBZONE_N.y")]First let’s check for any null values.
any(is.na(mpsz3414$total_origin))[1] TRUE
any(is.na(mpsz3414$total_dest))[1] TRUE
Fill NA with zero using coalesce() from dplyr.
mpsz3414 <- mpsz3414 %>%
mutate(total_origin = coalesce(total_origin, 0),
prop_origin = total_origin / sum(total_origin))
mpsz3414 <- mpsz3414 %>%
mutate(total_dest = coalesce(total_dest, 0),
prop_dest = total_dest / sum(total_dest))any(is.na(mpsz3414$total_origin))[1] FALSE
any(is.na(mpsz3414$total_dest))[1] FALSE
class(mpsz3414)[1] "sf" "data.frame"
- Now there are no null values.
Since I have added critical data in mpsz3414, it is easier to plot the choropleth from the it. However, mpsz3414 is missing the geometry attribute. So let’s insert it from mpsz_sf.
# reorder columns for readability
mpsz3414 <- mpsz3414[c("SUBZONE_N", "PLN_AREA_N", "Name", "total_origin", "prop_origin", "total_dest", "prop_dest", "geometry")]Choropleth Maps
Now, let’s have a general overview of the distribution of pick-up and drop-off points.
Starting Locations
tmap_options(check.and.fix = TRUE)
origin_choropleth <-
tm_shape(mpsz3414) +
tm_fill("prop_origin",
n=10,
title="Proportion",
style="equal",
palette="YlOrRd") +
tm_borders(lwd=0.2,
alpha=1) +
tm_layout(main.title = "Distribution of Pick-up Points",
legend.outside=FALSE,
main.title.size=1)Ending Locations
dest_choropleth <-
tm_shape(mpsz3414) +
tm_fill("prop_dest",
n=10,
title="Proportion",
style="equal",
palette="YlOrRd") +
tm_borders(lwd=0.2,
alpha=1) +
tm_layout(main.title = "Distribution of Drop-off Points",
legend.outside=FALSE,
main.title.size=1) Comparing Starting and Ending Locations
tmap_mode("plot")
tmap_arrange(origin_choropleth, dest_choropleth, nrow=1)
- At a quick glance, we can observe that there are the most pick-up and drop-off locations at the Changi Airport subzone. The quick inference is that there are many travellers commuting to and from the airport via Grab ride-hailing services.
Locations with top proportions
Sort prop_origin in descending order using desc().
origin_top <- mpsz3414 %>%
arrange(desc(prop_origin)) %>%
slice(1:10)
origin_top <- origin_top[, c("SUBZONE_N", "PLN_AREA_N", "total_origin", "prop_origin")]
origin_topSimple feature collection with 10 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 11411.94 ymin: 31874.91 xmax: 50271.73 ymax: 47996.47
Projected CRS: SVY21 / Singapore TM
SUBZONE_N PLN_AREA_N total_origin prop_origin
1 CHANGI AIRPORT CHANGI 1456 0.05200000
2 XILIN TAMPINES 660 0.02357143
3 TAMPINES EAST TAMPINES 600 0.02142857
4 WOODLANDS EAST WOODLANDS 474 0.01692857
5 SIMEI TAMPINES 418 0.01492857
6 YUNNAN JURONG WEST 403 0.01439286
7 WOODLANDS WEST WOODLANDS 367 0.01310714
8 WOODGROVE WOODLANDS 362 0.01292857
9 ALJUNIED GEYLANG 305 0.01089286
10 WOODLANDS SOUTH WOODLANDS 305 0.01089286
geometry
1 MULTIPOLYGON (((45818.88 41...
2 MULTIPOLYGON (((44832.31 33...
3 MULTIPOLYGON (((42196.76 38...
4 MULTIPOLYGON (((24786.75 46...
5 MULTIPOLYGON (((42267.54 36...
6 MULTIPOLYGON (((12670.56 35...
7 MULTIPOLYGON (((21900.33 45...
8 MULTIPOLYGON (((22694.69 46...
9 MULTIPOLYGON (((34449.13 33...
10 MULTIPOLYGON (((24091.47 46...
Sort prop_dest in descending order using desc().
dest_top <- mpsz3414 %>%
arrange(desc(prop_dest)) %>%
slice(1:10)
dest_top <- dest_top[, c("SUBZONE_N", "PLN_AREA_N", "total_dest", "prop_dest")]
dest_topSimple feature collection with 10 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 20830.46 ymin: 31874.91 xmax: 50271.73 ymax: 46337.37
Projected CRS: SVY21 / Singapore TM
SUBZONE_N PLN_AREA_N total_dest prop_dest
1 CHANGI AIRPORT CHANGI 1633 0.05832768
2 TAMPINES EAST TAMPINES 496 0.01771618
3 XILIN TAMPINES 409 0.01460871
4 CENTRAL WATER CATCHMENT CENTRAL WATER CATCHMENT 406 0.01450155
5 WOODGROVE WOODLANDS 350 0.01250134
6 KAMPONG JAVA KALLANG 340 0.01214416
7 SWISS CLUB BUKIT TIMAH 330 0.01178698
8 ALJUNIED GEYLANG 326 0.01164410
9 BENDEMEER KALLANG 296 0.01057256
10 HILLCREST BUKIT TIMAH 296 0.01057256
geometry
1 MULTIPOLYGON (((45818.88 41...
2 MULTIPOLYGON (((42196.76 38...
3 MULTIPOLYGON (((44832.31 33...
4 MULTIPOLYGON (((25073.29 43...
5 MULTIPOLYGON (((22694.69 46...
6 MULTIPOLYGON (((30089.09 32...
7 MULTIPOLYGON (((24031.39 36...
8 MULTIPOLYGON (((34449.13 33...
9 MULTIPOLYGON (((31277.37 34...
10 MULTIPOLYGON (((25392.54 35...
After Changi, planning areas like Tampines- and Woodlands-related subzones are in the top 10 for the proportions of pick-up and drop-off locations.
Interestingly, Central Water Catchment planning area is one of the top few drop-off locations.
Tampines
The Tampines planning area has piqued my interest. Let’s have a closer look.
tampines <- mpsz3414[mpsz3414$PLN_AREA_N == 'TAMPINES', ]
roads_tampines <- sg_roads[st_intersection(sg_roads, tampines), ]
origin_tampines <- joined_origin[st_intersection(joined_origin, tampines), ]
tmap_mode("view")
tm_basemap("OpenStreetMap")+
tm_shape(tampines) +
tm_borders(lwd=0.5) +
tm_shape(roads_tampines) +
tm_lines() +
tm_shape(origin_tampines) +
tm_dots()tmap_mode("plot")Woodlands
woodlands <- mpsz3414[mpsz3414$PLN_AREA_N == 'WOODLANDS', ]
roads_woodlands <- sg_roads[st_intersection(sg_roads, woodlands), ]
origin_woodlands <- joined_origin[st_intersection(joined_origin, woodlands), ]
tmap_mode("view")
tm_basemap("OpenStreetMap")+
tm_shape(woodlands) +
tm_borders(lwd=0.5) +
tm_shape(roads_woodlands) +
tm_lines() +
tm_shape(origin_woodlands) +
tm_dots()tmap_mode("plot")Central Water Catchement (CWC)
Since CWC area popped up as one of the top ten drop off locations, we should specify joined dest.
cwc <- mpsz3414[mpsz3414$PLN_AREA_N == 'CENTRAL WATER CATCHMENT', ]
roads_cwc <- sg_roads[st_intersection(sg_roads, cwc), ]
dest_cwc <- joined_dest[st_intersection(joined_dest, cwc), ]
tmap_mode("view")
tm_basemap("OpenStreetMap")+
tm_shape(cwc) +
tm_borders(lwd=0.5) +
tm_shape(roads_cwc) +
tm_lines() +
tm_shape(dest_cwc) +
tm_dots()Many of the drop-off points are along the parameters of the CWC planning area!
tmap_mode("plot")Downtown Core
Let’s explore other areas like CBD as my assumption is that office workers might use Grab to travel to their workplaces. From prior knowledge, CBD is within the planning area of Downtown Core.
Starting Locations
downtown_core <- mpsz3414[mpsz3414$PLN_AREA_N == 'DOWNTOWN CORE', ]
roads_downtown_core <- sg_roads[st_intersection(sg_roads, downtown_core), ]
origin_downtown_core <- joined_origin[st_intersection(joined_origin, downtown_core), ]
tmap_mode("view")
tm_basemap("OpenStreetMap")+
tm_shape(downtown_core) +
tm_borders(lwd=0.5) +
tm_shape(roads_downtown_core) +
tm_lines() +
tm_shape(origin_downtown_core) +
tm_dots()Compared to the sparse distribution of points previously for CWC, the pick-up points seem to be distributed all over the roads in CBD.
tmap_mode("plot")Kernel Density Estimation
Comparing Spatial Point Patterns using KDE
Here, we shall compare the kernel density estimation of pick-up and drop-off points at the pre-selected planning areas.
Create owin object
Convert the SpatialPolygons objects into owin objects as required by spatstat.
tm_owin <- as.owin(tampines)
wd_owin <- as.owin(woodlands)
dt_owin <- as.owin(downtown_core)
cwc_owin <- as.owin(cwc)Combining points and the interested areas
Extract the pick-up and drop-off points within each of the study planning areas.
Interested Starting Locations
origin_tm_ppp = origin_ppp[tm_owin]
origin_wd_ppp = origin_ppp[wd_owin]
origin_dt_ppp = origin_ppp[dt_owin]
originstandardGeneric for "origin" defined from package "terra"
function (x, ...)
standardGeneric("origin")
<bytecode: 0x0000019105c169a0>
<environment: 0x0000019105c2e9b8>
Methods may be defined for arguments: x
Use showMethods(origin) for currently available ones.
Using origin_ppp or originSG_ppp does not matter as the restriction to the various planning area object windows will extract the necessary points.
Let’s use rescale() to transform the unit of measurement from metres to kilometres.
origin_tm_ppp.km = rescale(origin_tm_ppp, 1000, "km")
origin_wd_ppp.km = rescale(origin_wd_ppp, 1000, "km")
origin_dt_ppp.km = rescale(origin_dt_ppp, 1000, "km")Then look at the output.
par(mfrow=c(2,2))
par(mar = c(3,3,2,1))
plot(origin_tm_ppp.km, main= "Origin Tampines")
plot(origin_wd_ppp.km, main= "Origin Woodlands")
plot(origin_dt_ppp.km, main= "Origin Downtown Core")
Repeat the same steps for the interested ending locations!
Interested Ending Locations
dest_tm_ppp = dest_ppp[tm_owin]
dest_wd_ppp = dest_ppp[wd_owin]
dest_dt_ppp = dest_ppp[dt_owin]
dest_cwc_ppp = dest_ppp[cwc_owin]Let’s use rescale() to transform the unit of measurement from metres to kilometres.
dest_tm_ppp.km = rescale(dest_tm_ppp, 1000, "km")
dest_wd_ppp.km = rescale(dest_wd_ppp, 1000, "km")
dest_dt_ppp.km = rescale(dest_dt_ppp, 1000, "km")
dest_cwc_ppp.km = rescale(dest_cwc_ppp, 1000, "km")par(mfrow=c(2,2))
par(mar = c(3,3,2,1))
plot(dest_tm_ppp, main= "Dest Tampines")
plot(dest_wd_ppp, main= "Dest Woodlands")
plot(dest_dt_ppp, main= "Dest Downtown Core")
plot(dest_cwc_ppp, main="Dest CWC")
Bandwidth Methods
There are multiple methods in determining the bandwidth type to use for kernel density. Here, let’s experiment with the different bandwidth methods before proceeding further.
Automatic Bandwidth Selection
The purpose for this project is to visualise where the pick-up and drop-off points are clustered. Let’s observe the cluster visualisations produced by the various bandwidth functions.
Interested Starting Locations
Tampines
bw.CvL
plot(density(origin_tm_ppp.km,
sigma=bw.CvL,
edge=TRUE,
kernel="gaussian"),
main="Tamp Origin CvL")
bw.scott
plot(density(origin_tm_ppp.km,
sigma=bw.scott,
edge=TRUE,
kernel="gaussian"),
main="Tamp Origin scott")
bw.diggle
plot(density(origin_tm_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="gaussian"),
main="Tamp Origin diggle")
bw.ppl
plot(density(origin_tm_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Tamp Origin ppl")
- From the visualisations of Tampines, let’s go ahead with bw.diggle for the Tampines starting locations.
Woodlands
plot(density(origin_wd_ppp.km,
sigma=bw.CvL,
edge=TRUE,
kernel="gaussian"),
main="Woodlands Origin CvL")
plot(density(origin_wd_ppp.km,
sigma=bw.scott,
edge=TRUE,
kernel="gaussian"),
main="Woodlands Origin scott")
plot(density(origin_wd_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="gaussian"),
main="Woodlands Origin diggle")
plot(density(origin_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Woodlands Origin ppl")
- The bw.ppl method can be used as it shows better visualisation of predominant tight clusters in Woodlands.
Downtown Core
plot(density(origin_dt_ppp.km,
sigma=bw.CvL,
edge=TRUE,
kernel="gaussian"),
main="Downtown Origin CvL")
plot(density(origin_dt_ppp.km,
sigma=bw.scott,
edge=TRUE,
kernel="gaussian"),
main="Downtown Origin scott")
plot(density(origin_dt_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="gaussian"),
main="Downtown Origin")
plot(density(origin_dt_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Downtown Origin ppl")
- The bw.ppl method also seems to produce the best visualisation for CBD Downtown Core.
Interested Ending Locations
Tampines
plot(density(dest_tm_ppp.km,
sigma=bw.CvL,
edge=TRUE,
kernel="gaussian"),
main="Tampines Dest CvL")
plot(density(dest_tm_ppp.km,
sigma=bw.scott,
edge=TRUE,
kernel="gaussian"),
main="Tampines Dest scott")
plot(density(dest_tm_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="gaussian"),
main="Tampines Dest diggle")
plot(density(dest_tm_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Tampines Dest ppl")
- In my opinion, the bw.ppl method showcases tighter clusters better than bw.diggle. Hence the bw.diggle method will be used for the Tampines drop-off locations.
Woodlands
plot(density(dest_wd_ppp.km,
sigma=bw.CvL,
edge=TRUE,
kernel="gaussian"),
main="Woodlands Dest CvL")
plot(density(dest_wd_ppp.km,
sigma=bw.scott,
edge=TRUE,
kernel="gaussian"),
main="Woodlands Dest scott")
plot(density(dest_wd_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="gaussian"),
main="Woodlands Dest diggle")
plot(density(dest_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Woodlands Dest ppl")
- The bw.ppl method seems to work best for the Woodlands drop-off locations.
Central Water Catchment :
:: {.panel-tabset}
bw.CvL
plot(density(dest_cwc_ppp.km,
sigma=bw.CvL,
edge=TRUE,
kernel="gaussian"),
main="CWC Dest CvL")
bw.scott
plot(density(dest_cwc_ppp.km,
sigma=bw.scott,
edge=TRUE,
kernel="gaussian"),
main="CWC Dest scott")
bw.diggle
plot(density(dest_cwc_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="gaussian"),
main="CWC Dest diggle")
bw.ppl
plot(density(dest_cwc_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="CWC Dest ppl")
:::
- Again, the bw.ppl method seems to work best for the Central Water Catchment drop-off points.
To summarise the automatic bandwidth selection process, the bw.ppl method is chosen for all areas of interest except for Tampines Pick-ups.
Baddeley et al. (2016) suggested the use of bw.ppl() algorithm because in their experience, the algorithm tends to produce the more appropriate values when the pattern consists predominantly tight clusters.
However, they also insist that if the purpose is to detect a single tight cluster in the midst of random noise then bw.diggle() is the best. Thus we can test for this using the Tampines Pick-ups.
Kernel Method
The different kernel methods are demonstrated below.
Interested Starting Locations
Tampines
plot(density(origin_tm_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="gaussian"),
main="Gaussian")
plot(density(origin_tm_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="epanechnikov"),
main="Quadratic")
plot(density(origin_tm_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="quartic"),
main="Quartic")
plot(density(origin_tm_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="disc"),
main="Disc")
- There are no significant differences observed across the Tampines Origin kernels, so we will use the default which is Gaussian.
Woodlands
plot(density(origin_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Gaussian")
plot(density(origin_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="epanechnikov"),
main="Quadratic")
plot(density(origin_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="quartic"),
main="Quartic")
plot(density(origin_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="disc"),
main="Disc")
- For Woodlands pick-ups, the disc kernel method seems best for visualisation.
Downtown Core (CBD)
:::{.panel-tabset}
Gaussian
plot(density(origin_dt_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Gaussian")
Quadratic (Epanechnikov)
plot(density(origin_dt_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="epanechnikov"),
main="Quadratic")
Quartic
plot(density(origin_dt_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="quartic"),
main="Quartic")
Disc
plot(density(origin_dt_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="disc"),
main="Disc")
- The visualisation outputs across the kernel methods are similar, hence we will adopt the default Gaussian kernel method for CBD pick-ups.
Interested Ending Locations
Tampines
plot(density(dest_tm_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Gaussian")
plot(density(dest_tm_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="epanechnikov"),
main="Epanechnikov")
plot(density(dest_tm_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="quartic"),
main="Quartic")
plot(density(dest_tm_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="disc"),
main="Disc")
- The visualisation outputs across the kernel methods are similar, hence we will adopt the default Gaussian kernel method for Tampines drop-offs.
Woodlands
plot(density(dest_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Gaussian")
plot(density(dest_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="epanechnikov"),
main="Epanechnikov")
plot(density(dest_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="quartic"),
main="Quartic")
plot(density(dest_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="disc"),
main="Disc")
- The visualisation outputs across the kernel methods are similar, hence we will adopt the default Gaussian kernel method for Woodlands drop-offs.
CWC
plot(density(dest_cwc_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian"),
main="Gaussian")
plot(density(dest_cwc_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="epanechnikov"),
main="Epanechnikov")
plot(density(dest_cwc_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="quartic"),
main="Quartic")
plot(density(dest_cwc_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="disc"),
main="Disc")
- The visualisation outputs across the kernel methods are similar, hence we will adopt the default Gaussian kernel method for Woodlands drop-offs.
A grand summary of what I have done so far:
| Part | Planning Area | Automatic Bandwidth | Kernel Method |
|---|---|---|---|
| Origin | Tampines | diggle | Gaussian |
| Origin | Woodlands | ppl | Disc |
| Origin | Downtown Core | ppl | Gaussian |
| Destination | Tampines | ppl | Gaussian |
| Destination | Woodlands | ppl | Gaussian |
| Destination | Central Water Catchment | ppl | Gaussian |
Final Kernel Density Estimation Maps
kde_origin_tm.bw <-
density(origin_tm_ppp.km,
sigma=bw.diggle,
edge=TRUE,
kernel="gaussian")
plot(kde_origin_tm.bw)
kde_origin_wd.bw <- density(origin_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="disc")
plot(kde_origin_wd.bw)
kde_origin_dt.bw <-
density(origin_dt_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian")
plot(kde_origin_dt.bw)
kde_dest_tm.bw <- density(dest_tm_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian")
plot(kde_dest_tm.bw)
kde_dest_wd.bw <- density(dest_wd_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian")
plot(kde_dest_wd.bw)
kde_dest_cwc.bw <- density(dest_cwc_ppp.km,
sigma=bw.ppl,
edge=TRUE,
kernel="gaussian")
plot(kde_dest_cwc.bw)
Converting Gridded Output into Raster
To display KDE maps on tmap, the maps neeed to be converted to rasters.
Converting to Spatial Grid Data Frame
First set the KDE maps into Spatial Grids.
gridded_kde_origin_tm_bw <- as(kde_origin_tm.bw, "SpatialGridDataFrame")
gridded_kde_origin_wd_bw <- as(kde_origin_wd.bw, "SpatialGridDataFrame")
gridded_kde_origin_dt_bw <- as(kde_origin_dt.bw, "SpatialGridDataFrame")
gridded_kde_dest_tm_bw <- as(kde_dest_tm.bw, "SpatialGridDataFrame")
gridded_kde_dest_wd_bw <- as(kde_dest_wd.bw, "SpatialGridDataFrame")
gridded_kde_dest_cwc_bw <- as(kde_dest_cwc.bw, "SpatialGridDataFrame")View the gridded outputs.
spplot(gridded_kde_origin_tm_bw)
spplot(gridded_kde_origin_wd_bw)
spplot(gridded_kde_origin_dt_bw)
spplot(gridded_kde_dest_tm_bw)
spplot(gridded_kde_dest_wd_bw)
spplot(gridded_kde_dest_cwc_bw)
Rasterisation of Grid Outputs
Convert the gridded outputs into rasters.
kde_origin_tm_bw_raster <- raster(gridded_kde_origin_tm_bw)
kde_origin_wd_bw_raster <- raster(gridded_kde_origin_wd_bw)
kde_origin_dt_bw_raster <- raster(gridded_kde_origin_dt_bw)
kde_dest_tm_bw_raster <- raster(gridded_kde_dest_tm_bw)
kde_dest_wd_bw_raster <- raster(gridded_kde_dest_wd_bw)
kde_dest_cwc_bw_raster <- raster(gridded_kde_dest_cwc_bw)Assigning Projection Systems
Now include the CRS information.
projection(kde_origin_tm_bw_raster) <- CRS("+init=EPSG:3414")
projection(kde_origin_wd_bw_raster) <- CRS("+init=EPSG:3414")
projection(kde_origin_dt_bw_raster) <- CRS("+init=EPSG:3414")
projection(kde_dest_tm_bw_raster) <- CRS("+init=EPSG:3414")
projection(kde_dest_wd_bw_raster) <- CRS("+init=EPSG:3414")
projection(kde_dest_cwc_bw_raster) <- CRS("+init=EPSG:3414")Visualise tmap Outputs
tmap_mode("plot")
tm_shape(kde_origin_tm_bw_raster) +
tm_raster("v") +
tm_layout(legend.position = c("left","bottom"),frame=FALSE)
tmap_mode("plot")
tm_shape(kde_origin_wd_bw_raster) +
tm_raster("v") +
tm_layout(legend.position = c("right","top"),frame=FALSE)
tmap_mode("plot")
tm_shape(kde_origin_dt_bw_raster) +
tm_raster("v") +
tm_layout(legend.position = c("right","bottom"),frame=FALSE)
tmap_mode("plot")
tm_shape(kde_dest_tm_bw_raster) +
tm_raster("v") +
tm_layout(legend.position = c("left","bottom"),frame=FALSE)
tmap_mode("plot")
tm_shape(kde_dest_wd_bw_raster) +
tm_raster("v") +
tm_layout(legend.position = c("left","top"),frame=FALSE)
tmap_mode("plot")
tm_shape(kde_dest_cwc_bw_raster) +
tm_raster("v") +
tm_layout(legend.position = c("left","bottom"),frame=FALSE)
Network-Constrained Kernel Density Estimation (NKDE) & Analysis
Previously, I have already extracted out the roads for Tampines (roads_tampines), Woodlands (roads_woodlands), Downtown Core (roads_downtown_core) and Central Water Catchment (roads_cwc).
Segmentation & Snapping
To perform NKDE, events like pick-up and drop-off points need to be snapped onto the road network. This will enable the lines to gain the necessary spatial weights based on point events.
Create the line segments for all the road networks of our areas of interest.
unique_types <- unique(st_geometry_type(roads_tampines))
# Filter roads_tampines to keep only linestrings
if ("LINESTRING" %in% unique_types) {
roads_tampines <- st_cast(roads_tampines, "LINESTRING")
} else {
# Handle the case when no linestrings are found
stop("No linestrings found in roads_tampines.")
}
lixels_tm <- lixelize_lines(roads_tampines,
80,
mindist=40)
samples_tm <- lines_center(lixels_tm)
densities_tm_origin <- nkde(roads_tampines,
events = origin_tampines,
w = rep(1,nrow(origin_tampines)),
samples = samples_tm,
kernel_name = "quartic",
bw = 300,
div= "bw",
method = "simple",
digits = 1,
tol = 1,
grid_shape = c(1,1),
max_depth = 8,
agg = 5, #we aggregate events within a 5m radius (faster calculation)
sparse = TRUE,
verbose = FALSE)
samples_tm$density_origin <- densities_tm_origin
lixels_tm$density_origin <- densities_tm_origin
samples_tm$density_origin <- samples_tm$density_origin*1000
lixels_tm$density_origin <- lixels_tm$density_origin*1000
tmap_mode("view")
tm_shape(lixels_tm)+
tm_lines(col="density_origin")+
tm_shape(origin_tampines)+
tm_dots(alpha=0.1)